To accurately determine rate and voice activity in
moderate-to-low signal-to-noise ratios (SNRs) (item 703),
and thereby maximize voice quality, system capacity and/or
battery life, parameters from a noise suppression system are
used as inputs to the rate determination function. Voice
metrics are compared to thresholds (item 715) and rates are
determined (items 721, 727, 730).
APPARATUS AND METHOD FOR RATE DETERMINATION
IN A COMMUNICATION SYSTEM 

FIELD OF THE INVENTION

The present invention relates generally to rate 
determination and, more particularly, to rate determination in 
communication systems. 


BACKGROUND OF THE INVENTION 

In variable rate vocoder systems, such as IS-96, IS-127
(EVRC), and CDG-27, there remains the problem of distinguishing
between voice and background noise in moderate to low
signal-to-noise ratio (SNR) environments. The problem is that if
the Rate Determination Algorithm (RDA) is too sensitive, the
average data rate will be too high since much of the background
noise will be coded at Rate 1/2 or Rate 1. This will result in a
loss of capacity in code division multiple access (CDMA) systems.
Conversely, if the RDA is set too "lean", low level speech signals
will remain buried in moderate levels of noise and be coded at
Rate 1/8. This will result in degraded speech quality due to lower
intelligibility.

Although the RDAs in the EVRC and CDG-27 have been
improved since IS-96, recent testing by the CDMA Development
Group (CDG) has indicated that there is still a problem in car noise
environments where the SNR is 10 dB or less. This level of SNR
may seem extreme, but in hands-free mobile situations this should
be considered a nominal level. Fixed-rate vocoders in time
division multiple access (TDMA) mobile units can also be faced
with similar problems when using discontinuous transmission
(DTX) to prolong battery life. In this scenario, a Voice Activity
Detector (VAD) determines whether or not the transmit power
amplifier is activated, so the tradeoff becomes voice quality versus
battery life.

Thus, a need exists for an improved apparatus and method for 
rate determination in communication systems. 


BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 generally depicts a communication system which
beneficially implements improved rate determination in
accordance with the invention.

FIG. 2 generally depicts a block diagram of an apparatus
useful in implementing rate determination in accordance with the
invention.

FIG. 3 generally depicts frame-to-frame overlap which occurs
in the noise suppression system of FIG. 2.

FIG. 4 generally depicts trapezoidal windowing of
preemphasized samples which occurs in the noise suppression
system of FIG. 2.

FIG. 5 generally depicts a block diagram of the spectral
deviation estimator within the noise suppression system depicted
in FIG. 2.

FIG. 6 generally depicts a flow diagram of the steps
performed in the update decision determiner within the noise
suppression system depicted in FIG. 2.

FIG. 7 generally depicts a flow diagram of the steps
performed by the rate determination block of FIG. 2 to determine
transmission rate in accordance with the invention.

FIG. 8 generally depicts a flow diagram of the steps
performed by a voice activity detector to determine the presence of
voice activity in accordance with the invention.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

To accurately determine rate and voice activity in
moderate-to-low signal-to-noise ratios (SNRs) to maximize voice
quality, system capacity and/or battery life, parameters from a noise
suppression system are used as inputs to the rate determination
function. Using this method, more of the speech is extracted from
the background noise, and fewer false onsets are detected during
fluctuating noise conditions, compared with conventional systems.
The method is beneficial for voice activity detection (VAD) as well
as rate determination (RDA) and, unlike other RDA/VAD
implementations, is independent of the type of speech coder
employed (IS-127, CDG-27, IS-96 and GSM).

Stated generally, an apparatus for determining transmission
rate in a communication system comprises a noise suppression
system for suppressing background noise in a signal input to the
noise suppression system, the noise suppression system generating
parameters related to the suppression of the background noise, and
a rate determination means, having as input the parameters
generated by the noise suppression system, for generating
transmission rate information for use by a speech coder. In the
preferred embodiment, the noise suppression system is
substantially a noise suppression system as defined in IS-127 and
the parameters generated by the noise suppression system include a
control signal which allows the noise suppression system to
recover when a sudden increase in background noise causes the
noise suppression system to misclassify background noise.

Stated more specifically, the apparatus for determining
transmission rate in a communication system comprises means for
estimating the channel energy in a current frame of information
and means, having as input the estimated channel energy, for
determining the difference between the estimated channel energy
for the current frame of information and the energy of a plurality
of past frames of information to produce a total channel energy
estimate for the current frame. A means for determining a voice
metric then determines the voice metric based on estimates of
signal-to-noise ratio of the current frame of information, and a
means for producing a total estimated noise energy does so based
on the estimated channel energy. Based on the total channel energy
estimate for the current frame, the voice metric and the total
estimated noise energy, a means for determining the rate of
transmission determines the transmission rate of the frame of
information.

In this embodiment, the apparatus further comprises a
means, having as input the total channel energy estimate for the
current frame of information, a peak-to-average ratio of the current
frame of information, a spectral deviation between the current
frame and past frames and the voice metric, for producing a control
signal which prevents a noise estimate from being updated when
certain types of signals are present. More specifically, the control
signal prevents a noise estimate from being updated when tonal
signals are present, which allows sinewaves to be transmitted at full
rate for purposes of testing the communication system.

The steps performed by the apparatus in accordance with the
invention include determining a first voice metric threshold from
a peak signal-to-noise ratio of a current frame of information and
comparing a voice metric to the first voice metric threshold.
When the voice metric is less than the first voice metric threshold,
the frame of information is transmitted at a first rate. When the
voice metric is greater than the first voice metric threshold, the
voice metric is compared to a second voice metric threshold.
When the voice metric is less than the second voice metric
threshold, the frame of information is transmitted at a second rate;
otherwise the frame of information is transmitted at a third rate.

The communication system implementing such steps is a
code-division multiple access (CDMA) communication system as
defined in IS-95. As defined in IS-95, the first rate comprises 1/8
rate, the second rate comprises 1/2 rate and the third rate comprises
full rate of the CDMA communication system. In this
embodiment, the second voice metric threshold is a scaled version
of the first voice metric threshold and a hangover is implemented
after transmission at either the second or third rate.
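
The two-threshold comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_rate` and the default `scale` of 1.5 are hypothetical (the text says only that the second threshold is a scaled version of the first), a voice metric exactly equal to a threshold is treated as not less than it, and the hangover is omitted.

```python
def select_rate(voice_metric, v_th, scale=1.5):
    """Pick a CDMA rate for one frame from the voice metric.

    v_th  : first voice metric threshold
    scale : hypothetical factor giving the second threshold (scale * v_th)
    """
    if voice_metric < v_th:
        return "1/8"          # first rate
    if voice_metric < scale * v_th:
        return "1/2"          # second rate
    return "full"             # third rate
```

With a first threshold of 37, a metric of 10 would select 1/8 rate, 40 would select 1/2 rate and 100 would select full rate.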

The peak signal-to-noise ratio of a current frame of
information in this embodiment comprises a quantized peak
signal-to-noise ratio of a current frame of information. As such,
the step of determining a voice metric threshold from the
quantized peak signal-to-noise ratio of a current frame of
information further comprises the steps of calculating a total
signal-to-noise ratio for the current frame of information and
estimating a peak signal-to-noise ratio based on the calculated total
signal-to-noise ratio for the current frame of information. The
peak signal-to-noise ratio of the current frame of information is
then quantized to determine the voice metric threshold.

The communication system can likewise be a time-division
multiple access (TDMA) communication system such as the GSM
TDMA communication system. The method in this case
determines that the first rate comprises a silence descriptor (SID)
frame and the second and third rates comprise normal rate frames.
As stated above, a SID frame includes the normal amount of
information but is transmitted less often than a normal frame of
information.

FIG. 1 generally depicts a communication system which
beneficially implements improved rate determination in
accordance with the invention. In the embodiment depicted in
FIG. 1, the communication system is a code-division multiple
access (CDMA) radiotelephone system, but as one of ordinary skill
in the art will appreciate, various other types of communication
systems which implement variable rate coding and voice activity
detection (VAD) may beneficially employ the present invention.
One such type of system which implements VAD for prolonging
battery life is a time division multiple access (TDMA)
communication system.

As shown in FIG. 1, a public switched telephone network 103
(PSTN) is coupled to a mobile switching center 106 (MSC). As is
well known in the art, the PSTN 103 provides wireline switching
capability while the MSC 106 provides switching capability related
to the CDMA radiotelephone system. Also coupled to the MSC 106
is a controller 109, the controller 109 including noise suppression,
rate determination and voice coding/decoding in accordance with
the invention. The controller 109 controls the routing of signals
to/from base-stations 112-113 where the base-stations are
responsible for communicating with a mobile station 115. The
CDMA radiotelephone system is compatible with Interim Standard
(IS) 95-A. For more information on IS-95-A, see TIA/EIA/IS-95-A,
Mobile Station-Base Station Compatibility Standard for Dual Mode
Wideband Spread Spectrum Cellular System, July 1993. While the
switching capability of the MSC 106 and the control capability of the
controller 109 are shown as distributed in FIG. 1, one of ordinary
skill in the art will appreciate that the two functions could be
combined in a common physical entity for system implementation.

As shown in FIG. 2, a signal s(n) is input into the controller
109 from the MSC 106 and enters the apparatus 201 which performs
noise suppression based rate determination in accordance with the
invention. In the preferred embodiment, the noise suppression
portion of the apparatus 201 is a slightly modified version of the
noise suppression system described in § 4.1.2 of TIA document
IS-127 titled "Enhanced Variable Rate Codec, Speech Service Option 3
for Wideband Spread Spectrum Digital Systems" published Jan.
1997 in the United States, the disclosure of which is herein
incorporated by reference. The signal s'(n) exiting the apparatus
201 enters a voice encoder (not shown) which is well known in the
art and encodes the noise suppressed signal for transfer to the
mobile station 115 via a base station 112-113. Also shown in FIG. 2
is a rate determination algorithm (RDA) 248 which uses
parameters from the noise suppression system to determine voice
activity and rate determination information in accordance with the
invention.

To fully understand how the parameters from the noise
suppression system are used to determine voice activity and rate
determination information, an understanding of the noise
suppression system portion of the apparatus 201 is necessary. It
should be noted at this point that the operation of the noise
suppression system portion of the apparatus 201 is generic in that it
is capable of operating with any type of speech coder a design
engineer may wish to implement in a particular communication
system. It is noted that several blocks depicted in FIG. 2 of the
present application operate similarly to the corresponding blocks
depicted in FIG. 1 of US Pat. No. 4,811,404 to Vilmur. As such, US
Pat. No. 4,811,404 to Vilmur, assigned to the assignee of the present
application, is incorporated herein by reference.

Referring now to FIG. 2, the noise suppression portion of the
apparatus 201 comprises a high pass filter (HPF) 200 and remaining
noise suppressor circuitry. The output of the HPF 200, $s_{hp}(n)$, is
used as input to the remaining noise suppressor circuitry. Although
the frame size of the speech coder is 20 ms (as defined by IS-95), the
frame size of the remaining noise suppressor circuitry is 10 ms.
Consequently, in the preferred embodiment, the steps to perform
noise suppression are executed two times per 20 ms speech frame.

To begin noise suppression, the input signal s(n) is high pass
filtered by high pass filter (HPF) 200 to produce the signal $s_{hp}(n)$.
The HPF 200 is a fourth order Chebyshev type II filter with a cutoff
frequency of 120 Hz, a design which is well known in the art. The
transfer function of the HPF 200 is defined as:

$$H_{hp}(z) = \frac{\sum_{i=0}^{4} b(i)\, z^{-i}}{\sum_{i=0}^{4} a(i)\, z^{-i}},$$

where the respective numerator and denominator coefficients are
defined to be:

b = { 0.898025036, -3.59010601, 5.38416243, -3.59010601, 0.898024917 },
a = { 1.0, -3.78284979, 5.37379122, -3.39733505, 0.806448996 }.
As one of ordinary skill in the art will appreciate, any number of
high pass filter configurations may be employed.
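
As an illustration of how the coefficient sets b and a might be applied, the sketch below runs a direct-form difference equation over a list of samples. It assumes the conventional rational transfer function with a(0) = 1; the function name `high_pass` is hypothetical and the sketch is not taken from IS-127.

```python
# Coefficients as listed in the text.
B = [0.898025036, -3.59010601, 5.38416243, -3.59010601, 0.898024917]
A = [1.0, -3.78284979, 5.37379122, -3.39733505, 0.806448996]


def high_pass(samples, b=B, a=A):
    """Direct-form IIR filter: a[0]*y[n] = sum b[i]*x[n-i] - sum a[i]*y[n-i]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for i in range(len(b)):          # feed-forward (numerator) taps
            if n - i >= 0:
                acc += b[i] * samples[n - i]
        for i in range(1, len(a)):       # feedback (denominator) taps
            if n - i >= 0:
                acc -= a[i] * out[n - i]
        out.append(acc / a[0])
    return out
```

Samples before the start of the list are taken to be zero, which matches the convention that all variables start at zero.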

Next, in the preemphasis block 203, the signal $s_{hp}(n)$ is
windowed using a smoothed trapezoid window, in which the first
D samples d(m) of the input frame (frame "m") are overlapped
from the last D samples of the previous frame (frame "m-1"). This
overlap is best seen in FIG. 3. Unless otherwise noted, all variables
have initial values of zero, e.g., d(m) = 0; m ≤ 0. This can be
described as:

$$d(m,n) = d(m-1,\, L+n); \quad 0 \le n < D,$$

where m is the current frame, n is a sample index to the buffer
{d(m)}, L = 80 is the frame length, and D = 24 is the overlap (or
delay) in samples. The remaining samples of the input buffer are
then preemphasized according to the following:

$$d(m,\, D+n) = s_{hp}(n) + \zeta_p\, s_{hp}(n-1); \quad 0 \le n < L,$$

where $\zeta_p = -0.8$ is the preemphasis factor. This results in the input
buffer containing L + D = 104 samples in which the first D samples
are the preemphasized overlap from the previous frame, and the
following L samples are input from the current frame.
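
The buffer construction described above might be sketched as follows. The function name `build_input_buffer` and the `prev_sample` argument (the last filtered sample of the previous frame) are assumptions introduced for the example.

```python
L = 80          # frame length in samples
D = 24          # overlap (delay) in samples
ZETA_P = -0.8   # preemphasis factor


def build_input_buffer(prev_buffer, frame, prev_sample=0.0):
    """Return the L + D sample buffer d(m) for the current frame.

    prev_buffer : buffer d(m-1), length L + D (all zeros before frame 1)
    frame       : L high-pass filtered samples of the current frame
    prev_sample : last filtered sample of the previous frame (assumed input)
    """
    buf = list(prev_buffer[L:L + D])        # d(m, n) = d(m-1, L + n)
    last = prev_sample
    for x in frame:
        buf.append(x + ZETA_P * last)       # d(m, D+n) = s(n) + zeta_p * s(n-1)
        last = x
    return buf
```

Calling the function frame after frame, passing the previous result back in as `prev_buffer`, reproduces the 24-sample overlap between consecutive frames.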
Next, in the windowing block 204 of FIG. 2, a smoothed
trapezoid window 400 (FIG. 4) is applied to the samples to form a
Discrete Fourier Transform (DFT) input signal g(n). In the
preferred embodiment, g(n) is defined as:

$$g(n) = \begin{cases}
d(m,n)\,\sin^2\!\big(\pi(n+0.5)/2D\big); & 0 \le n < D, \\
d(m,n); & D \le n < L, \\
d(m,n)\,\sin^2\!\big(\pi(n-L+D+0.5)/2D\big); & L \le n < D+L, \\
0; & D+L \le n < M,
\end{cases}$$

where M = 128 is the DFT sequence length and all other terms are
previously defined.
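
A minimal sketch of the smoothed trapezoid window follows. The falling edge is written as the mirror image of the rising edge, which is an assumption made for the example because that term is not fully legible in the text; `window_frame` is a hypothetical name.

```python
import math

L, D, M = 80, 24, 128   # frame length, overlap, DFT length


def window_frame(d):
    """Apply the smoothed trapezoid window to L + D samples and zero-pad to M."""
    g = [0.0] * M
    for n in range(D + L):
        if n < D:                                   # rising sin^2 edge
            w = math.sin(math.pi * (n + 0.5) / (2 * D)) ** 2
        elif n < L:                                 # flat top
            w = 1.0
        else:                                       # falling edge (assumed mirror)
            w = math.sin(math.pi * (n - L + D + 0.5) / (2 * D)) ** 2
        g[n] = d[n] * w
    return g
```

Under this assumption the rising and falling edges sum to one wherever consecutive frames overlap, which is the property the later overlap-and-add step depends on.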

In the channel divider 206 of FIG. 2, the transformation of
g(n) to the frequency domain is performed using the Discrete
Fourier Transform (DFT) defined as:

$$G(k) = \frac{2}{M}\sum_{n=0}^{M-1} g(n)\, e^{-j2\pi nk/M}; \quad 0 \le k < M,$$

where $e^{j\omega}$ is a unit amplitude complex phasor with instantaneous
radial position $\omega$. This is an atypical definition, but one that
exploits the efficiencies of the complex Fast Fourier Transform
(FFT). The 2/M scale factor results from preconditioning the M
point real sequence to form an M/2 point complex sequence that is
transformed using an M/2 point complex FFT. In the preferred
embodiment, the signal G(k) comprises 65 unique channels.
Details on this technique can be found in Proakis and Manolakis,
Introduction to Digital Signal Processing, 2nd Edition, New York,
Macmillan, 1988, pp. 721-722.
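
For illustration, the scaled transform can be written as a direct O(M²) sum. This sketch does not use the M/2 point complex FFT trick mentioned in the text; it only demonstrates the 2/M scaling, and `scaled_dft` is a hypothetical name.

```python
import cmath

M = 128  # DFT sequence length


def scaled_dft(g):
    """G(k) = (2/M) * sum_n g(n) * exp(-j*2*pi*n*k/M) for 0 <= k < M."""
    return [
        (2.0 / M) * sum(g[n] * cmath.exp(-2j * cmath.pi * n * k / M)
                        for n in range(M))
        for k in range(M)
    ]
```

A constant input of ones would give G(0) = 2 and essentially zero in every other bin.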

The signal G(k) is then input to the channel energy
estimator 209 where the channel energy estimate $E_{ch}(m)$ for the
current frame, m, is determined using the following:

$$E_{ch}(m,i) = \max\left\{E_{min},\; \alpha_{ch}(m)\,E_{ch}(m-1,i) + \big(1-\alpha_{ch}(m)\big)\,\frac{1}{f_H(i)-f_L(i)+1}\sum_{k=f_L(i)}^{f_H(i)} |G(k)|^2\right\}; \quad 0 \le i < N_c,$$

where $E_{min} = 0.0625$ is the minimum allowable channel energy,
$\alpha_{ch}(m)$ is the channel energy smoothing factor (defined below),
$N_c = 16$ is the number of combined channels, and $f_L(i)$ and $f_H(i)$
are the i-th elements of the respective low and high channel
combining tables, $f_L$ and $f_H$. In the preferred embodiment, $f_L$
and $f_H$ are defined as:

$f_L$ = { 2, 4, 6, 8, 10, 12, 14, 17, 20, 23, 27, 31, 36, 42, 49, 56 },
$f_H$ = { 3, 5, 7, 9, 11, 13, 16, 19, 22, 26, 30, 35, 41, 48, 55, 63 }.
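
The channel combining step might be sketched as below, assuming the energy of each channel is the mean squared magnitude of the DFT bins it spans, smoothed against the previous frame. The function name `channel_energy` is hypothetical.

```python
E_MIN = 0.0625  # minimum allowable channel energy
F_L = [2, 4, 6, 8, 10, 12, 14, 17, 20, 23, 27, 31, 36, 42, 49, 56]
F_H = [3, 5, 7, 9, 11, 13, 16, 19, 22, 26, 30, 35, 41, 48, 55, 63]


def channel_energy(G, prev_energy, alpha):
    """Smoothed per-channel energy estimate for one frame.

    G           : DFT bins of the current frame
    prev_energy : channel energies of the previous frame
    alpha       : channel energy smoothing factor (0 for the first frame)
    """
    energy = []
    for i, (lo, hi) in enumerate(zip(F_L, F_H)):
        mean_power = sum(abs(G[k]) ** 2 for k in range(lo, hi + 1)) / (hi - lo + 1)
        smoothed = alpha * prev_energy[i] + (1.0 - alpha) * mean_power
        energy.append(max(E_MIN, smoothed))
    return energy
```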

The channel energy smoothing factor, $\alpha_{ch}(m)$, can be defined as:

$$\alpha_{ch}(m) = \begin{cases} 0; & m \le 1, \\ 0.45; & m > 1, \end{cases}$$

which means that $\alpha_{ch}(m)$ assumes a value of zero for the first frame
(m = 1) and a value of 0.45 for all subsequent frames. This allows
the channel energy estimate to be initialized to the unfiltered
channel energy of the first frame. In addition, the channel noise
energy estimate (as defined below) should be initialized to the
channel energy of the first four frames, i.e.:

$$E_n(m,i) = \max\{E_{init},\; E_{ch}(m,i)\}; \quad 1 \le m \le 4,\; 0 \le i < N_c,$$

where $E_{init} = 16$ is the minimum allowable channel noise
initialization energy.

The channel energy estimate $E_{ch}(m)$ for the current frame is
next used to estimate the quantized channel signal-to-noise ratio
(SNR) indices. This estimate is performed in the channel SNR
estimator 218 of FIG. 2, and is determined as:

$$\sigma_q(i) = \max\left\{0,\; \min\left\{89,\; \mathrm{round}\left\{10\log_{10}\!\left(\frac{E_{ch}(m,i)}{E_n(m,i)}\right) \Big/\, 0.375\right\}\right\}\right\}; \quad 0 \le i < N_c,$$

where $E_n(m)$ is the current channel noise energy estimate (as
defined later), and the values of $\{\sigma_q\}$ are constrained to be between 0
and 89, inclusive.
Using the channel SNR estimate $\{\sigma_q\}$, the sum of the voice
metrics is determined in the voice metric calculator 215 using:

$$v(m) = \sum_{i=0}^{N_c-1} V(\sigma_q(i)),$$

where V(k) is the k-th value of the 90 element voice metric table V,
which is defined as:

V = { 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9, 9, 10, 10,
11, 12, 12, 13, 13, 14, 15, 15, 16, 17, 17, 18, 19, 20, 20, 21, 22, 23, 24, 24, 25, 26, 27, 28, 28,
29, 30, 31, 32, 33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 50, 50,
50, 50, 50, 50, 50, 50, 50 }.
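
The table lookup and summation can be illustrated as follows. The one illegible table entry is taken to be 9, which is the value that keeps the table monotonic and 90 elements long; `voice_metric_sum` is a hypothetical name.

```python
# 90-element voice metric table (one illegible entry assumed to be 9).
V = [
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6,
    7, 7, 7, 8, 8, 9, 9, 10, 10, 11, 12, 12, 13, 13, 14, 15, 15, 16, 17, 17,
    18, 19, 20, 20, 21, 22, 23, 24, 24, 25, 26, 27, 28, 28, 29, 30, 31, 32,
    33, 34, 35, 36, 37, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
    50, 50, 50, 50, 50, 50, 50, 50, 50, 50,
]


def voice_metric_sum(indices):
    """Sum of voice metrics over the quantized channel SNR indices."""
    return sum(V[i] for i in indices)
```

With 16 channels, the sum therefore ranges from 32 (all indices 0) to 800 (all indices 89).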

The channel energy estimate $E_{ch}(m)$ for the current frame is
also used as input to the spectral deviation estimator 210, which
estimates the spectral deviation $\Delta_E(m)$. With reference to FIG. 5,
the channel energy estimate $E_{ch}(m)$ is input into a log power
spectral estimator 500, where the log power spectra is estimated as:

$$E_{dB}(m,i) = 10\log_{10}\big(E_{ch}(m,i)\big); \quad 0 \le i < N_c.$$

The channel energy estimate $E_{ch}(m)$ for the current frame is also
input into a total channel energy estimator 503, to determine the
total channel energy estimate, $E_{tot}(m)$, for the current frame, m,
according to the following:

$$E_{tot}(m) = 10\log_{10}\left(\sum_{i=0}^{N_c-1} E_{ch}(m,i)\right).$$

Next, an exponential windowing factor, $\alpha(m)$ (as a function of total
channel energy $E_{tot}(m)$) is determined in the exponential
windowing factor determiner 506 using:

$$\alpha(m) = \alpha_H - \left(\frac{\alpha_H - \alpha_L}{E_H - E_L}\right)\big(E_H - E_{tot}(m)\big),$$

which is limited between $\alpha_H$ and $\alpha_L$ by:

$$\alpha(m) = \max\big\{\alpha_L,\; \min\{\alpha_H,\; \alpha(m)\}\big\},$$



WO 98/38631    PCT/US98/00130



where $E_H$ and $E_L$ are the energy endpoints (in decibels, or "dB") for
the linear interpolation of $E_{tot}(m)$, which is transformed to $\alpha(m)$
and has the limits $\alpha_L \le \alpha(m) \le \alpha_H$. The values of these constants are
defined as: $E_H = 50$, $E_L = 30$, $\alpha_H = 0.99$, $\alpha_L = 0.50$. Given this, a
signal with relative energy of, say, 40 dB would use an exponential
windowing factor of $\alpha(m) = 0.745$ using the above calculation.
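The spectral deviation $\Delta_E(m)$ is then estimated in the
spectral deviation estimator 509. The spectral deviation $\Delta_E(m)$ is
the difference between the current power spectrum and an
averaged long-term power spectral estimate:

$$\Delta_E(m) = \sum_{i=0}^{N_c-1} \big|E_{dB}(m,i) - \bar{E}_{dB}(m,i)\big|,$$

where $\bar{E}_{dB}(m)$ is the averaged long-term power spectral estimate,
which is determined using:

$$\bar{E}_{dB}(m+1,i) = \alpha(m)\,\bar{E}_{dB}(m,i) + \big(1-\alpha(m)\big)\,E_{dB}(m,i); \quad 0 \le i < N_c.$$

The initial value of the long-term estimate is the log power
spectra of the first frame, or:

$$\bar{E}_{dB}(m) = E_{dB}(m); \quad m = 1.$$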
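
The spectral deviation and the long-term spectral update might be sketched as below. The function names are hypothetical, and the update is assumed to be a first-order exponential average weighted by the windowing factor.

```python
import math


def log_spectrum(e_ch):
    """Log power spectrum in dB for each channel."""
    return [10.0 * math.log10(e) for e in e_ch]


def spectral_deviation(e_db, long_term_db):
    """Sum of absolute differences between current and long-term spectra."""
    return sum(abs(c - a) for c, a in zip(e_db, long_term_db))


def update_long_term(long_term_db, e_db, alpha):
    """Exponentially average the long-term spectrum (assumed update form)."""
    return [alpha * a + (1.0 - alpha) * c for a, c in zip(long_term_db, e_db)]
```

A frame whose 16 channels each sit 3 dB away from the long-term estimate would give a deviation of 48, well above a deviation threshold of 28.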



At this point, the sum of the voice metrics v(m), the total
channel energy estimate for the current frame $E_{tot}(m)$ and the
spectral deviation $\Delta_E(m)$ are input into the update decision
determiner, where the noise estimate update decision is
ultimately made. With reference to FIG. 6, the process begins at step
603, where the update flag (update_flag) is cleared. Then, at step
604, the update logic (VMSUM only) of Vilmur is implemented by
checking whether the sum of the voice metrics v(m) is less than an
update threshold (UPDATE_THLD). If the sum of the voice metrics
is less than the update threshold, the update counter (update_cnt)
is cleared at step 605, and the update flag is set at step 606. The
pseudo-code for steps 603-606 is shown below:



    update_flag = FALSE
    if ( v(m) < UPDATE_THLD ) {
        update_flag = TRUE
        update_cnt = 0
    }



If the sum of the voice metrics is not less than the update
threshold at step 604, the normal update of the noise estimate is
disabled. In that case, at step 607, the total channel energy estimate,
$E_{tot}(m)$, for the current frame, m, is compared with the noise floor
in dB (NOISE_FLOOR_DB), and the spectral deviation $\Delta_E(m)$ is
compared with the deviation threshold (DEV_THLD). If the total
channel energy estimate is greater than the noise floor and the
spectral deviation is less than the deviation threshold, the update
counter is incremented at step 608. After the update counter has been
incremented, a test is performed at step 609 to determine whether
the update counter is greater than or equal to an update counter
threshold (UPDATE_CNT_THLD). If the result of the test at step
609 is true, then the forced update flag is set at step 613 and the
update flag is set at step 606. The pseudo-code for steps 607-609 and
606 is shown below:

    else if (( E_tot(m) > NOISE_FLOOR_DB ) and ( Δ_E(m) < DEV_THLD )) {
        update_cnt = update_cnt + 1
        if ( update_cnt >= UPDATE_CNT_THLD )
            update_flag = TRUE
    }
As can be seen from FIG. 6, if either of the tests at steps 607
and 609 are false, or after the update flag has been set at step 606,
logic to prevent long-term "creeping" of the update counter is
implemented. This hysteresis logic prevents the update counter
from slowly accumulating over long periods. At step 610, a test is
performed to determine whether the update counter has been equal
to the last update counter value (last_update_cnt) for the last six
frames (HYSTER_CNT_THLD). In the preferred embodiment, six
frames are used as a threshold, but any number of frames may be
implemented. If the test at step 610 is true, the update counter is
cleared at step 611, and the process exits to the next frame at
step 612. If the test at step 610 is false, the process exits
directly to the next frame at step 612. The pseudo-code for steps
610-612 is shown below:

    if ( update_cnt == last_update_cnt )
        hyster_cnt = hyster_cnt + 1
    else
        hyster_cnt = 0
    last_update_cnt = update_cnt
    if ( hyster_cnt > HYSTER_CNT_THLD )
        update_cnt = 0



In the preferred embodiment, the values of the previously used
constants are as follows:

    UPDATE_THLD = 35,
    NOISE_FLOOR_DB = 10 log10(1),
    DEV_THLD = 28,
    UPDATE_CNT_THLD = 50, and
    HYSTER_CNT_THLD = 6.
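
The three pseudo-code fragments for steps 603 to 612 can be combined into one per-frame routine. The sketch below is one possible arrangement, not the patent's own implementation: the class name `UpdateDecision` and its `step` method are hypothetical, and the counters are kept as instance state between frames.

```python
UPDATE_THLD = 35
NOISE_FLOOR_DB = 0.0       # 10*log10(1)
DEV_THLD = 28
UPDATE_CNT_THLD = 50
HYSTER_CNT_THLD = 6


class UpdateDecision:
    """Per-frame noise-estimate update decision (hypothetical wrapper)."""

    def __init__(self):
        self.update_cnt = 0
        self.last_update_cnt = 0
        self.hyster_cnt = 0

    def step(self, v, e_tot_db, deviation):
        update_flag = False                                   # step 603
        if v < UPDATE_THLD:                                   # step 604
            update_flag = True                                # step 606
            self.update_cnt = 0                               # step 605
        elif e_tot_db > NOISE_FLOOR_DB and deviation < DEV_THLD:  # step 607
            self.update_cnt += 1                              # step 608
            if self.update_cnt >= UPDATE_CNT_THLD:            # step 609
                update_flag = True
        # Hysteresis (steps 610-612): clear a counter that has stalled.
        if self.update_cnt == self.last_update_cnt:
            self.hyster_cnt += 1
        else:
            self.hyster_cnt = 0
        self.last_update_cnt = self.update_cnt
        if self.hyster_cnt > HYSTER_CNT_THLD:
            self.update_cnt = 0
        return update_flag
```

In this sketch, fifty consecutive frames of stationary, above-floor energy with a high voice metric trigger an update on the fiftieth frame, while a counter that stops advancing for seven frames is reset.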



Whenever the update flag at step 606 is set for a given frame,
the channel noise estimate for the next frame is updated. The
channel noise estimate is updated in the smoothing filter 224
using:

$$E_n(m+1,i) = \max\big\{E_{min},\; \alpha_n E_n(m,i) + (1-\alpha_n)\,E_{ch}(m,i)\big\}; \quad 0 \le i < N_c,$$

where $E_{min} = 0.0625$ is the minimum allowable channel energy, and
$\alpha_n = 0.9$ is the channel noise smoothing factor stored locally in the
smoothing filter 224. The updated channel noise estimate is stored
in the energy estimate storage 225, and the output of the energy
estimate storage 225 is the updated channel noise estimate $E_n(m)$.
The updated channel noise estimate $E_n(m)$ is used as an input to
the channel SNR estimator 218 as described above, and also the
gain calculator 233 as will be described below.
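
A short sketch of the smoothing filter follows. The function name `update_noise` is hypothetical, and returning the estimate unchanged when the flag is clear is an assumption about how a caller would use it.

```python
E_MIN = 0.0625   # minimum allowable channel energy
ALPHA_N = 0.9    # channel noise smoothing factor


def update_noise(e_n, e_ch, update_flag):
    """Return the channel noise estimate for the next frame."""
    if not update_flag:
        return list(e_n)                      # no update this frame
    return [max(E_MIN, ALPHA_N * n + (1.0 - ALPHA_N) * c)
            for n, c in zip(e_n, e_ch)]
```

For example, a noise estimate of 1 and a channel energy of 11 would move the estimate to about 2 on an update frame.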

Next, the noise suppression portion of the apparatus 201
determines whether a channel SNR modification should take
place. This determination is performed in the channel SNR
modifier 227, which counts the number of channels which have
channel SNR index values which exceed an index threshold.
During the modification process itself, channel SNR modifier 227
reduces the SNR of those particular channels having an SNR index
less than a setback threshold (SETBACK_THLD), or reduces the
SNR of all of the channels if the sum of the voice metrics is less
than a metric threshold (METRIC_THLD). A pseudo-code
representation of the channel SNR modification process occurring
in the channel SNR modifier 227 is provided below:

    index_cnt = 0
    for ( i = N_M to N_c - 1 step 1 ) {
        if ( σ_q(i) > INDEX_THLD )
            index_cnt = index_cnt + 1
    }
    if ( index_cnt < INDEX_CNT_THLD )
        modify_flag = TRUE
    else
        modify_flag = FALSE

    if ( modify_flag == TRUE )
        for ( i = 0 to N_c - 1 step 1 )
            if (( v(m) < METRIC_THLD ) or ( σ_q(i) < SETBACK_THLD ))
                σ'_q(i) = 1
            else
                σ'_q(i) = σ_q(i)
    else
        {σ'_q} = {σ_q}

At this point, the channel SNR indices are limited to an SNR
threshold in the SNR threshold block 230. The constant σ_th is
stored locally in the SNR threshold block 230. A pseudo-code
representation of the process performed in the SNR threshold
block 230 is provided below:

    for ( i = 0 to N_c - 1 step 1 )
        if ( σ'_q(i) < σ_th )
            σ''_q(i) = σ_th
        else
            σ''_q(i) = σ'_q(i)

In the preferred embodiment, the previous constants and
thresholds are given to be:

    N_M = 5,
    INDEX_THLD = 12,
    INDEX_CNT_THLD = 5,
    METRIC_THLD = 45,
    SETBACK_THLD = 12, and
    σ_th = 6.
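
The modification and limiting steps might be combined as in the sketch below. The value to which a reduced index is set is not legible in the text, so it is exposed as the hypothetical parameter `reduced_index` with an assumed default of 1; `modify_and_limit` is likewise a hypothetical name.

```python
N_M = 5
INDEX_THLD = 12
INDEX_CNT_THLD = 5
METRIC_THLD = 45
SETBACK_THLD = 12
SIGMA_TH = 6


def modify_and_limit(sigma_q, v, reduced_index=1):
    """Channel SNR modification followed by limiting to SIGMA_TH.

    reduced_index is the assumed value given to a setback channel.
    """
    # Count channels from N_M upward whose index exceeds the index threshold.
    index_cnt = sum(1 for s in sigma_q[N_M:] if s > INDEX_THLD)
    if index_cnt < INDEX_CNT_THLD:
        modified = [
            reduced_index if (v < METRIC_THLD or s < SETBACK_THLD) else s
            for s in sigma_q
        ]
    else:
        modified = list(sigma_q)
    # Limit every index to the SNR threshold.
    return [max(SIGMA_TH, s) for s in modified]
```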



At this point, the limited SNR indices $\{\sigma''_q\}$ are input into
the gain calculator 233, where the channel gains are determined.
First, the overall gain factor is determined using:

$$\gamma_n = \max\left\{\gamma_{min},\; -10\log_{10}\left(\frac{1}{E_{floor}}\sum_{i=0}^{N_c-1} E_n(m,i)\right)\right\},$$

where $\gamma_{min} = -13$ is the minimum overall gain, $E_{floor} = 1$ is the noise
floor energy, and $E_n(m)$ is the estimated noise spectrum calculated
during the previous frame. In the preferred embodiment, the
constants $\gamma_{min}$ and $E_{floor}$ are stored locally in the gain calculator 233.
Continuing, channel gains (in dB) are then determined using:

$$\gamma_{dB}(i) = \mu_g\big(\sigma''_q(i) - \sigma_{th}\big) + \gamma_n; \quad 0 \le i < N_c,$$

where $\mu_g = 0.39$ is the gain slope (also stored locally in gain
calculator 233). The linear channel gains are then converted using:

$$\gamma_{ch}(i) = \min\big\{1,\; 10^{\gamma_{dB}(i)/20}\big\}; \quad 0 \le i < N_c.$$
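
The gain computation can be illustrated with the sketch below, which assumes the overall gain factor is the negated total noise energy in dB, floored at the minimum overall gain. The function name `channel_gains` is hypothetical.

```python
import math

GAMMA_MIN = -13.0   # minimum overall gain in dB
E_FLOOR = 1.0       # noise floor energy
MU_G = 0.39         # gain slope
SIGMA_TH = 6        # SNR threshold


def channel_gains(sigma_limited, e_n):
    """Linear channel gains from limited SNR indices and the noise estimate."""
    gamma_n = max(GAMMA_MIN, -10.0 * math.log10(sum(e_n) / E_FLOOR))
    gains = []
    for s in sigma_limited:
        gamma_db = MU_G * (s - SIGMA_TH) + gamma_n
        gains.append(min(1.0, 10.0 ** (gamma_db / 20.0)))
    return gains
```

Under these assumptions a channel sitting at the SNR threshold in loud noise is attenuated by the full 13 dB, while a high-SNR channel passes with unity gain.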

At this point, the channel gains determined above are
applied to the transformed input signal G(k) with the following
criteria to produce the output signal H(k) from the channel gain
modifier 239:

$$H(k) = \begin{cases} \gamma_{ch}(i)\,G(k); & f_L(i) \le k \le f_H(i),\; 0 \le i < N_c, \\ G(k); & \text{otherwise.} \end{cases}$$

The otherwise condition in the above equation assumes the
interval of k to be 0 ≤ k ≤ M/2. It is further assumed that the
magnitude of H(k) is even symmetric, so that the following
condition is also imposed:
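$$H(M-k) = H^*(k); \quad 0 < k < M/2,$$

where the * denotes a complex conjugate. The signal H(k) is then
converted (back) to the time domain in the channel combiner
by using the inverse DFT:

$$h(m,n) = \frac{1}{2}\sum_{k=0}^{M-1} H(k)\, e^{j2\pi nk/M}; \quad 0 \le n < M,$$

and the frequency domain filtering process is completed to produce
the output signal h'(n) by applying overlap-and-add with the
following criteria:

$$h'(n) = \begin{cases} h(m,n) + h(m-1,\, n+L); & 0 \le n < M-L, \\ h(m,n); & M-L \le n < L. \end{cases}$$

Signal deemphasis is applied to the signal h'(n) by the deemphasis
block to produce the signal s'(n) having been noise suppressed:

$$s'(n) = h'(n) + \zeta_d\, s'(n-1); \quad 0 \le n < L,$$

where $\zeta_d = 0.8$ is a deemphasis factor stored locally within the
deemphasis block.

As stated above, the noise suppression portion of the
apparatus 201 is a slightly modified version of the noise
suppression system described in § 4.1.2 of TIA document IS-127
titled "Enhanced Variable Rate Codec, Speech Service Option 3 for
Wideband Spread Spectrum Digital Systems". Specifically, a rate
determination algorithm (RDA) block 248 is additionally shown in
FIG. 2 as is a peak-to-average ratio block 251. The addition of the
peak-to-average ratio block 251 prevents the noise estimate from
being updated during "tonal" signals. This allows the transmission
of sinewaves at Rate 1 which is especially useful for purposes of
system testing.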
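
Deemphasis is the inverse of the earlier preemphasis step, and that relationship can be checked with the short sketch below. It assumes a first-order recursive form with a factor of 0.8 (the counterpart of the -0.8 preemphasis factor); both function names are hypothetical.

```python
ZETA_P = -0.8   # preemphasis factor
ZETA_D = 0.8    # deemphasis factor


def preemphasize(x):
    """y(n) = x(n) + ZETA_P * x(n-1), with x(-1) = 0."""
    out, prev = [], 0.0
    for s in x:
        out.append(s + ZETA_P * prev)
        prev = s
    return out


def deemphasize(h):
    """s(n) = h(n) + ZETA_D * s(n-1), with s(-1) = 0 (assumed recursive form)."""
    out, prev = [], 0.0
    for s in h:
        prev = s + ZETA_D * prev
        out.append(prev)
    return out
```

Applying `deemphasize` to the output of `preemphasize` returns the original samples, which is why an unmodified spectrum (all channel gains equal to one) passes through the suppressor unchanged apart from the delay.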

Still referring to FIG. 2, parameters generated by the noise
suppression system described in IS-127 are used as the basis for
detecting voice activity and for determining transmission rate in
accordance with the invention. In the preferred embodiment,
parameters generated by the noise suppression system which are
used in the RDA block 248 in accordance with the
invention are the voice metric sum v(m), the total channel energy
$E_{tot}(m)$, the total estimated noise energy $E_{tn}(m)$, and the frame
number m. Additionally, a new flag labeled the "forced update
flag" (fupdate_flag) is generated to indicate to the RDA block 248
when a forced update has occurred. A forced update is a
mechanism which allows the noise suppression portion to recover
when a sudden increase in background noise causes the noise
suppression system to misclassify the background
noise. Given these parameters as inputs to the RDA block 248 and
the "rate" as the output of the RDA block 248, rate determination
in accordance with the invention can be explained in detail.

As stated above, most of the parameters input into the RDA
block 248 are generated by the noise suppression system defined in
IS-127. For example, the voice metric sum v(m) is determined in
Eq. 4.1.2.4-1 while the total channel energy $E_{tot}(m)$ is determined in
Eq. 4.1.2.5-4 of IS-127. The total estimated noise energy $E_{tn}(m)$ is
given by:

$$E_{tn}(m) = 10\log_{10}\left(\sum_{i=0}^{N_c-1} E_n(m,i)\right),$$

which is readily available from Eq. 4.1.2.8-1 of IS-127. The 10
millisecond frame number, m, starts at m = 1. The forced update
flag, fupdate_flag, is derived from the "forced update" logic
implementation shown in § 4.1.2.6 of IS-127. Specifically, the
pseudo-code for the generation of the forced update flag,
fupdate_flag, is provided below:
    /* Normal update logic */
    update_flag = fupdate_flag = FALSE
    if ( v(m) <= UPDATE_THLD ) {
        update_flag = TRUE
        update_cnt = 0
    }
    /* Forced update logic */
    else if (( E_tot(m) > NOISE_FLOOR_DB ) and ( Δ_E(m) < DEV_THLD )
              and ( sinewave_flag == FALSE )) {
        update_cnt = update_cnt + 1
        if ( update_cnt >= UPDATE_CNT_THLD )
            update_flag = fupdate_flag = TRUE
    }


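Here, the sinewave_flag is set TRUE when the spectral
peak-to-average ratio $\phi(m)$ is greater than 10 dB and the spectral
deviation $\Delta_E(m)$ is less than DEV_THLD. Stated differently:

$$\text{sinewave\_flag} = \begin{cases} \text{TRUE}; & \Delta_E(m) < \text{DEV\_THLD and } \phi(m) > 10, \\ \text{FALSE}; & \text{otherwise,} \end{cases}$$

where:

$$\phi(m) = 10\log_{10}\left(\frac{\max\big(E_{ch}(m)\big)}{\frac{1}{N_c}\sum_{i=0}^{N_c-1} E_{ch}(m,i)}\right)$$

is the peak-to-average ratio determined in the peak-to-average ratio
block 251 and $E_{ch}(m)$ is the channel energy estimate defined above.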
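
The tonal-signal test can be sketched as follows, assuming the peak-to-average ratio is the largest channel energy divided by the mean channel energy, expressed in dB. Both function names are hypothetical.

```python
import math

DEV_THLD = 28


def peak_to_average_db(e_ch):
    """Spectral peak-to-average ratio of the channel energies, in dB."""
    return 10.0 * math.log10(max(e_ch) / (sum(e_ch) / len(e_ch)))


def sinewave_flag(e_ch, deviation):
    """True for a stationary, strongly peaked (tonal) spectrum."""
    return deviation < DEV_THLD and peak_to_average_db(e_ch) > 10.0
```

With 16 channels the ratio cannot exceed about 12 dB, reached when a single channel holds nearly all of the energy, so the 10 dB limit selects spectra dominated by essentially one channel.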

After the appropriate inputs have been determined, rate
determination within the RDA block 248 can be performed in
accordance with the invention. With reference to the flow
diagram depicted in FIG. 7, the modified total energy $E'_{tot}(m)$ is
given as:

$$E'_{tot}(m) = \begin{cases} 56\ \text{dB}; & m \le 4 \text{ or fupdate\_flag} = \text{TRUE}, \\ E_{tot}(m); & \text{otherwise.} \end{cases}$$
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15 



Here, the initial modified total energy is set to an empirical 56 dB. 
The estimated total SNR can then be calculated, at step 703, as: 

This result is then used, at step 706, to estimate the long-term peak 
SNR, SNRpim), as: 



10 SNR^(m) = < 



SNR; SNR>SNR^im'}) or update _ flag = TRUE 
0.99SSNR^{Tn - 1) + 0.0Q2SNR; SNR > 0375SNR^im - 1) 
SNRAm - 1); otherwise 



where SNR_p(0) = 0. The long-term peak SNR is then quantized, at step 709, in 3 dB steps and limited to be between 0 and 19, as follows:

SNR_q = \max\left\{ \min\left\{ \lfloor SNR_p(m)/3 \rfloor, 19 \right\}, 0 \right\}

where ⌊x⌋ is the largest integer ≤ x (floor function). The quantized SNR can now be used to determine, at step 712, the respective voice metric threshold v_th, hangover count h_cnt and burst count threshold b_th parameters:

v_{th} = v_{table}[SNR_q], \quad h_{cnt} = h_{table}[SNR_q], \quad b_{th} = b_{table}[SNR_q]

where SNR_q is the index of the respective tables, which are defined as:

v_table = { 37, 37, 37, 37, 37, 37, 38, 38, 43, 50, 61, 75, 94, 118, 146, 178, 216, 258, 306, 359 }
h_table = { 25, 25, 25, 20, 16, 13, 10, 8, 6, 5, 4, 3, 2, 1, 0, 0, 0, 0, 0, 0 }
b_table = { 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 7, 6, 5, 4, 3, 2, 1, 1, 1 }
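
Steps 703 through 712 can be sketched in Python as shown below. The function names are hypothetical. The sketch assumes that the total SNR is the difference in dB between the modified total energy and the total estimated noise energy, that the forced update flag is the flag tested in both the energy and the peak SNR expressions, and that the table beginning 25, 25, 25 is the hangover table while the table of 8s is the burst count table.

```python
import math

V_TABLE = [37, 37, 37, 37, 37, 37, 38, 38, 43, 50,
           61, 75, 94, 118, 146, 178, 216, 258, 306, 359]
H_TABLE = [25, 25, 25, 20, 16, 13, 10, 8, 6, 5,
           4, 3, 2, 1, 0, 0, 0, 0, 0, 0]          # assumed: hangover counts
B_TABLE = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
           8, 7, 6, 5, 4, 3, 2, 1, 1, 1]          # assumed: burst thresholds


def total_snr(m, e_tot_db, e_tn_db, fupdate_flag):
    """Step 703: total SNR in dB, using the 56 dB start-up/forced-update value."""
    e_mod = 56.0 if (m < 4 or fupdate_flag) else e_tot_db
    return e_mod - e_tn_db


def update_peak_snr(snr, prev_peak, fupdate_flag):
    """Step 706: fast attack, slow decay, hold when SNR is far below the peak."""
    if snr > prev_peak or fupdate_flag:
        return snr
    if snr > 0.375 * prev_peak:
        return 0.998 * prev_peak + 0.002 * snr
    return prev_peak


def rda_parameters(peak_snr):
    """Steps 709 and 712: quantize in 3 dB steps, then look up (v_th, h_cnt, b_th)."""
    snr_q = max(min(math.floor(peak_snr / 3.0), 19), 0)
    return V_TABLE[snr_q], H_TABLE[snr_q], B_TABLE[snr_q]
```

For example, a peak SNR of 30 dB quantizes to index 10, which under the table assignment assumed here selects a voice metric threshold of 61, a hangover count of 4 and a burst count threshold of 8.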

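With this information, the rate determination output from the RDA block 248 is made. The respective voice metric threshold v_th, hangover count h_cnt, and burst count threshold b_th parameters output from block 712 are input into block 715 where a test is performed to determine whether the voice metric, v(m), is greater than the voice metric threshold. Important to note is that the voice metric, v(m), output from the noise suppression system does not change; rather, it is the voice metric threshold which varies within the RDA block 248 in accordance with the invention.

Referring to step 715 of FIG. 7, if the voice metric, v(m), is less than the voice metric threshold, the rate at which to transmit the signal s'(n) is determined to be 1/8 rate at step 721. After this determination, a hangover is implemented. The hangover is commonly implemented to "cover" slowly decaying speech that might otherwise be classified as noise, or to bridge small gaps in speech that may be degraded by aggressive rate determination. After the hangover is implemented, a valid rate transition is guaranteed, and the signal s'(n) is coded at 1/8 rate and transmitted to the appropriate mobile station 115.

If, at step 715, the voice metric, v(m), is greater than the voice metric threshold, another test is performed at step 724 to determine whether the voice metric is greater than a weighted (by an amount α) voice metric threshold. This allows speech signals that are close to the noise floor to be coded at 1/2 rate, which has the advantage of lowering the average data rate while maintaining high voice quality. If the voice metric, v(m), is not greater than the weighted voice metric threshold at step 724, the process flows to step 727 where the rate at which to transmit the signal s'(n) is determined to be 1/2 rate. If, however, the voice metric, v(m), is greater than the weighted voice metric threshold, the process flows to step 730, in which the rate is determined to be rate 1 (otherwise known as full rate). In either event (transmission at 1/2 rate via step 727 or transmission at full rate via step 730), the process flows to step 733 where a hangover is determined. After the hangover is determined, the process flows to step 736, where a valid rate transition is guaranteed. At this point, the signal s'(n) is coded at either 1/2 rate or full rate and transmitted to the appropriate mobile station 115 in accordance with the invention.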

Steps 715 through 733 of FIG. 7 can also be explained with reference to the following pseudocode:

    if ( v(m) > v_th ) {
        if ( v(m) > α v_th ) {            /* α = 1.1 */
            rate(m) = RATE1
        } else {
            rate(m) = RATE1/2
        }
        b(m) = b(m-1) + 1                 /* increment burst counter */
        if ( b(m) > b_th ) {              /* compare counter with threshold */
            h(m) = h_cnt                  /* set hangover */
        }
    } else {
        b(m) = 0                          /* clear burst counter */
        h(m) = h(m-1) - 1                 /* decrement hangover */
        if ( h(m) < 0 ) {
            rate(m) = RATE1/8
            h(m) = 0
        } else {
            rate(m) = rate(m-1)
        }
    }

The following pseudocode prevents invalid rate transitions as defined in IS-127. Note that two 10 ms noise suppression frames are required to determine one 20 ms vocoder frame rate. The final rate is determined by the maximum of two noise suppression based RDA frames.

    if ( rate(m) == RATE1/8 and rate(m-2) == RATE1 ) {
        rate(m) = RATE1/2
    }
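
As a rough illustration, the two pseudocode fragments can be combined into a small Python state machine. The class and function names are hypothetical, and the integer rate codes are an arbitrary choice made here so that max() selects the higher rate; the sketch also assumes that rate(m-1) and rate(m-2) refer to the rates after the transition check has been applied.

```python
RATE_EIGHTH, RATE_HALF, RATE_FULL = 1, 4, 8   # assumed codes, ordered low to high
ALPHA = 1.1                                   # threshold weighting from the pseudocode


class RateDeterminer:
    """Per-10-ms-frame rate decision with burst counter and hangover."""

    def __init__(self):
        self.burst = 0
        self.hangover = 0
        self.history = [RATE_EIGHTH, RATE_EIGHTH]   # [rate(m-2), rate(m-1)]

    def step(self, v, v_th, h_cnt, b_th):
        if v > v_th:
            rate = RATE_FULL if v > ALPHA * v_th else RATE_HALF
            self.burst += 1                   # increment burst counter
            if self.burst > b_th:             # long enough burst: arm hangover
                self.hangover = h_cnt
        else:
            self.burst = 0                    # clear burst counter
            self.hangover -= 1                # decrement hangover
            if self.hangover < 0:
                rate = RATE_EIGHTH
                self.hangover = 0
            else:
                rate = self.history[-1]       # hold the previous rate
        # Block the full-rate to eighth-rate transition across two frames.
        if rate == RATE_EIGHTH and self.history[-2] == RATE_FULL:
            rate = RATE_HALF
        self.history = [self.history[-1], rate]
        return rate


def vocoder_frame_rate(rate_a, rate_b):
    """One 20 ms vocoder frame takes the higher rate of its two 10 ms frames."""
    return max(rate_a, rate_b)
```

Feeding the determiner two strong speech frames followed by a noise frame with no hangover armed yields full, full, half: the last frame would have been eighth rate, but the transition check raises it to half rate.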
While the invention has been particularly shown and described with reference to a particular embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention. For example, the apparatus useful in implementing rate determination in accordance with the invention is shown in FIG. 2 as being implemented in the infrastructure side of the communication system, but one of ordinary skill in the art will appreciate that the apparatus of FIG. 2 could likewise be implemented in the mobile station 115. In this implementation, no changes are required to FIG. 2 to implement rate determination in accordance with the invention.

Also, the concept of rate determination in accordance with the invention, as described with specific reference to a CDMA communication system, can be extended to voice activity detection (VAD) as applied to a time-division multiple access (TDMA) communication system in accordance with the invention. In this implementation, the functionality of the RDA block 248 of FIG. 2 is replaced with the functionality of voice activity detection (VAD), where the output of the VAD block 248 is a VAD decision which is likewise input into the speech coder. The steps performed to determine whether voice activity exiting the VAD block 248 is TRUE or FALSE are similar to the flow diagram of FIG. 7 and are shown in FIG. 8. As shown in FIG. 8, the steps 703-715 are the same as shown in FIG. 7. However, if the test at step 715 is false, then VAD is determined to be FALSE at step 818 and the flow proceeds to the hangover implementation. If the test at step 715 is true, then VAD is determined to be TRUE and the flow proceeds to step 733, where a hangover is determined.
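
A minimal Python sketch of the VAD variant follows. The text does not spell out the hangover logic for FIG. 8, so the sketch assumes it mirrors the burst counter and hangover handling of the rate determination pseudocode; `VadState` and `vad_step` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class VadState:
    burst: int = 0        # consecutive frames above the threshold
    hangover: int = 0     # frames of activity still to be held
    active: bool = False  # last VAD decision


def vad_step(state, v, v_th, h_cnt, b_th):
    """Return the VAD decision (True or False) for one frame and update state."""
    if v > v_th:
        state.active = True
        state.burst += 1
        if state.burst > b_th:        # sustained activity arms the hangover
            state.hangover = h_cnt
    else:
        state.burst = 0
        state.hangover -= 1
        if state.hangover < 0:        # hangover exhausted: declare inactivity
            state.active = False
            state.hangover = 0
        # otherwise the previous decision is held during the hangover
    return state.active
```

Under this assumption, a single active frame with h_cnt = 1 and b_th = 0 keeps the decision TRUE for one further frame before it falls back to FALSE.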

The corresponding structures, materials, acts and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or acts for performing the functions in combination with other claimed elements as specifically claimed.

What I claim is:
Claims

1. An apparatus for determining transmission rate in a communication system, the apparatus comprising:

a noise suppression system for suppressing background noise in a signal input to the noise suppression system, the noise suppression system generating parameters related to the suppression of the background noise; and

rate determination means, having as input the parameters generated by the noise suppression system, for generating transmission rate information for use by a speech coder.

2. The apparatus of claim 1, wherein the noise suppression system is substantially a noise suppression system as defined in IS-127.

3. The apparatus of claim 1, wherein the parameters generated by the noise suppression system include a control signal which allows the noise suppression system to recover when a sudden increase in background noise causes the noise suppression system to erroneously misclassify background noise.

4. An apparatus for determining transmission rate in a communication system, the apparatus comprising:

means for estimating the channel energy in a current frame of information;

means, having as input the estimated [...], for determining the difference between [...] a plurality [...] channel energy;

means for [...] a voice metric based on estimates of [...] the current frame of information;

means for [...] estimated noise energy based on [...]; and

means for determining the rate of transmission of the frame [...] on the total channel energy estimate for the current frame, the voice metric and the total estimated noise energy.

5. The apparatus of claim 4, further comprising means, having as input [...] types of signals are present.

6. The apparatus of claim 5, wherein the control signal prevents a noise estimate from being updated when tonal signals are present.

7. The apparatus of claim 5, wherein the control signal which [...] rate for purposes of testing the communication system.

8. A method of determining transmission rate for a frame of information in a communication system, the method comprising the steps of:

determining a first voice metric threshold from a peak signal-to-noise ratio of a current frame of information;

comparing a voice metric to the first voice metric threshold;

transmitting the frame of information at a first rate when the voice metric is less than the first voice metric threshold;

comparing the voice metric to a second voice metric threshold when the voice metric is greater than the first voice metric threshold;

transmitting the frame of information at a second rate when the voice metric is less than the second voice metric threshold; and

transmitting the frame of information at a third rate when the voice metric is greater than the second voice metric threshold.

9. The method of claim 8, wherein the first rate comprises 1/8 rate, the second rate comprises 1/2 rate and the third rate comprises full rate in a code-division multiple access (CDMA) communication system as defined in IS-95.

10. The method of claim 8, wherein the first rate comprises a silence descriptor (SID) frame and the second and third rates comprise normal rate frames in a time-division multiple access (TDMA) communication system.



WO 98/38631    PCT/US98/00130

[FIG. 5 (sheet 4/7): block diagram. Blocks shown: log power spectral estimator, total channel energy estimator, exponential windowing factor determiner (506), long-term power spectral estimator, spectral deviation estimator. Other reference numerals shown: 210, 500, 503, 509, 512. Signal labels include E_tot(m).]



[FIG. 7 (sheet 6/7): flow diagram. Calculate total SNR for current frame (703); estimate peak SNR (706); quantize peak SNR (709); determine voice metric threshold from quantized peak SNR; determine hangover; guarantee valid rate transition (736).]



[FIG. 8 (sheet 7/7): flow diagram. Calculate total SNR for current frame (703); estimate peak SNR (706); quantize peak SNR (709); determine voice metric threshold from quantized peak SNR; implement hangover.]



INTERNATIONAL SEARCH REPORT

International application No. PCT/US98/00130

A. CLASSIFICATION OF SUBJECT MATTER
IPC(6): G10L 9/00
US CL: 704/223
According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED
Minimum documentation searched (classification system followed by classification symbols)
U.S.: 704/223, 704/500 [...]

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched
MAYA Search was done

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

Y          US 4,811,404 A (Vilmur et al.) 07 March 1989, Abstract.                               1-10
Y          US 5,410,632 A (Hong et al.) 25 April 1995, Abstract.                                 1-10
Y          US 5,657,420 A (Jacobs et al.) 12 August 1997, Abstract.                              1-10
Y          US 5,687,243 A (McLaughlin et al.) 11 November 1997, Abstract.                        1-10
X          Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread             1-10
           Spectrum Digital Systems, IS-127, 09 September 1996, Section 4.

Date of the actual completion of the international search: 31 MARCH 1998
Date of mailing of the international search report: 14 AUG 1998

Name and mailing address of the ISA/US:
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230

Authorized officer: SUSAN WIELAND
Telephone No. (703) 308-6593

Form PCT/ISA/210 (second sheet) (July 1993)






